Abstract. In this paper, we claim that language is likely to have emerged
as a mechanism for coordinating the solution of complex tasks. To con- 
firm this thesis, computer simulations are performed based on the coordi- 
nation task presented by Garrod & Anderson (1987). The role of success 
in task-oriented dialogue is analytically evaluated with the help of per- 
formance measurements and a thorough lexical analysis of the emergent 
communication system. Simulation results confirm a strong effect of suc- 
cess mattering on both reliability and dispersion of linguistic conventions. 



1 Introduction 



In the last decade, the field of communication science has seen a major increase
in the number of research programmes that go beyond the more conventional
studies of human dialogue (e.g. [6,7]) in an attempt to reproduce the emergence
of conventionalized communication systems in a laboratory (e.g. [4,8,10]). In
his seminal paper, Galantucci has proposed to refer to this line of research as
experimental semiotics, which he sees as a more general form of experimental
pragmatics. In particular, Galantucci states that the former "studies the
emergence of new forms of communication", while the latter "studies the
spontaneous use of pre-existing forms of communication" (p. 394, [5]).

Experimental semiotics provides a novel way of reproducing the emergence
of a conventionalized communication system under laboratory conditions.
However, the findings from this field cannot be transferred to the question of
the primeval emergence of language without the caveat that the subjects of
present-day experiments are very much familiar with the concepts of conventions
and communication systems (even if they are not allowed to employ any existing
versions of these in the conducted experiments), while our ancestors who
somehow managed to invent the very first conventionalized signaling system, by
definition, could not have been aware of these concepts. Since experimental
semiotics researchers cannot adjust the minds of their subjects in order to
find out how they could discover the concept of a communication system, the
most these experiments can realistically achieve is to make the subjects signal
the 'signalhood' of some novel form of communication (see [13]). To go any
further seems, at least for now, to require the use of computer models and
simulations.



2 Martin Bachwerk and Carl Vogel 

Consequently, we are interested in how a community of simulated agents can 
agree on a set of lexical conventions with a very limited amount of given knowl- 
edge about the notion of a communication system. In this particular paper, we 
address this issue by conducting several computer simulations that are meant to 
reconstruct the human experiments conducted by [6] and [7], which suggest that
the establishment of new conventions requires at least some understanding to
be experienced, for example measured in the success of the action performed in
response to an utterance, and that differently organized communities can come
up with variously effective communication systems. While the communities in
the current experiments are in a way similar to the social structures implemented
in [T], the focus here is on local coordination and the role of task-related
communicative success, rather than the effect of different higher-order group
structures.

2 Modelling Approach 

The experiments presented in this paper have been performed with the help of 
the Language Evolution Workbench (LEW) (see [16,1] for more detailed
descriptions of the model). This workbench provides over 20 adjustable parameters and
makes as few assumptions about the agents' cognitive skills and their aware- 
ness of the possibility of a conventionalized communication system as possible. 
The few cognitive skills that are assumed can be considered as widely accepted 
(see [11,14] among others) as the minimal prerequisites for the emergence of
language. These skills include the ability to observe and individuate events, the 
ability to engage in a joint attention frame fixed on an occurring event, and the 
ability to interact by constructing words and utterances from abstract symbols¹
and transmitting these to one's interlocutor.² During such interactions, one of
the agents is assigned the intention to comment on the event, while a second 
agent assumes that the topic of the utterance relates in some way to the event 
and attempts to decode the meaning of the encountered symbols accordingly. 

From an evolutionary point of view, the LEW fits in with the so-called
faculty of language in the narrow sense as proposed by [5] in that the agents 
are equipped with the sensory, intentional and concept-mapping skills at the 
start, and the simulations attempt to provide an insight into how these could be 
combined to produce a communication system with comparable properties to a 
human language. From a pragmatics point of view, our approach directly adopts 
the claim made by [12] that dialogue is the underlying form of communication.
Furthermore, despite the agents in the LEW lacking any kind of embodiment, 
they are designed in a way that makes each agent individuate events according to 



1 While we often refer to such symbols as 'phonemes' throughout the paper, there 
is no reason why these should not be representative of gestural signs. 

2 Phenomena such as noise and loss of data during signal transmission are ignored 
in our approach for the sake of simplicity. 

3 It is important to stress that hearers are not assumed to know the word
boundaries of an encountered utterance. However, simulations with so-called
synchronized transmission have been performed previously by [15].
its own perspective, which in most cases results in their situation models being
initially non-aligned, thus providing the agents with the task of aligning their 
representations, similarly to the account presented in [12] . 

3 Experiment Design 

In the presented experiments, we aim to reproduce the two studies originally 
performed by Garrod and his colleagues, but in an evolutionary simulation per- 
formed on an abstract model of communication. Our reconstruction lies in the 
context of a simulated dynamic system of agents which should provide us with 
some insights about how Garrod's findings can be transferred to the domain of 
language evolution. The remainder of this section outlines the configuration of 
the LEW used in the present study, together with an explanation of the three 
manipulated parameters. The results of the corresponding simulations are then 
evaluated in Section 4, with special emphasis being put on the
communicative potential and general linguistic properties of the emergent
communication systems.⁴

Garrod observed in his two studies that conventions have a better chance
of getting established and reused if their utilisation appears to lead to one's
interlocutor understanding one's utterance, with the interlocutor either
explicitly signaling so or performing an adequate action. Notably, in task-based
communication, interlocutors may succeed in achieving a task with or without
complete mutual understanding of the surrounding dialogue. Nevertheless, our
simulations have been focussed on a parameter of the LEW that defines the
probability $p_{sm}$ that communicative success matters in an interaction.
From an evolutionary point of
view, this parameter is motivated by the numerous theories that put coopera- 
tion and survival as the core function of communication (e.g. [2]). However, the 
abstract implementation of the parameter allows us to refrain from selecting any 
particular evolutionary theory as the target one by generalizing over all kinds 
of possible success that may result from a communication bout, e.g. avoiding a 
predator, hunting down prey or fighting off a rival gang.

The levels of the parameter that defines whether success matters were varied
between 0 and 1 (in steps of 0.25) in the presented simulations. To clarify the
selected values of the parameter, $p_{sm} = 0$ means that communicative success
plays no role whatsoever in the system and $p_{sm} = 1$ means that only
interactions satisfying a minimum success threshold will be remembered by the
agents. The
minimum success threshold is established by an additional parameter of the 
LEW and can be generally interpreted as the minimum amount of information 
that needs to be extracted by the hearer from an encountered utterance in order 
to be of any use. In our experiments, we have varied between a minimum success 



4 We intentionally refrain from referring to the syntax-less communication systems 
that emerge in our simulations as 'language' as that would be seen as highly contentious 
by many readers. Furthermore, even though the term 'protolanguage' appears to be 
quite suited for our needs (cf. [H]), the controversial nature of that term does not 
really encourage its use either, prompting us to stick to more neutral expressions. 
threshold of 0.25 and 1 (in steps of 0.25).⁵ The effects of this parameter will not
be reported in this paper due to a lack of significance and space limitations. 
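
The interplay between the probability that success matters and the minimum success threshold can be illustrated with a small sketch. This is not the LEW's actual code: the function name `should_remember`, the representation of success as a fraction in [0, 1], and the per-interaction random draw are all assumptions made for illustration.

```python
import random

def should_remember(success, p_sm, min_success, rng):
    """Decide whether an interaction is stored by the agents.

    success     -- assumed: share of the utterance understood, in [0, 1]
    p_sm        -- probability that communicative success matters
    min_success -- minimum success threshold
    rng         -- random.Random instance (hypothetical plumbing)
    """
    # With probability p_sm, success matters for this interaction.
    success_matters = rng.random() < p_sm
    if not success_matters:
        return True  # success plays no role: the interaction is always kept
    return success >= min_success

rng = random.Random(0)
# p_sm = 0: even a complete failure is remembered.
always = all(should_remember(0.0, 0.0, 0.5, rng) for _ in range(1000))
# p_sm = 1: interactions below the threshold are never remembered.
never = any(should_remember(0.25, 1.0, 0.5, rng) for _ in range(1000))
```

Under these assumptions, a threshold of 0 lets every interaction through regardless of the draw, which matches the remark that such a setting behaves like success not mattering at all.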

In addition to the above two parameters, the presented experiments also 
introduce two different interlocutor arrangements, similar to the studies in [6]
and [7]. In the first of these, pairs of agents are partnered with each other for the
whole duration of the simulation, meaning that they do not converse with any
other agents at all. The second arrangement emulates the community setting
introduced in [7] by successively alternating the pairings of agents, in our case
after every 100 interaction 'epochs'.⁶ The introduction of the community setting
was motivated by the hypothesis that a community of agents should be able 
to engage in a global coordination process, as opposed to local entrainment, 
resulting in more generalized and thus eventually more reliable conventions. 
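
The two interlocutor arrangements can be sketched as a pairing schedule. The text does not specify how partners are alternated in the community setting, so the random reshuffling below is an assumption, as are all function names; only the population of ten agents and the switch after every 100 epochs come from the description above.

```python
import random

def fixed_pairs(agents):
    """Pair agent 0 with 1, 2 with 3, and so on."""
    return [(agents[i], agents[i + 1]) for i in range(0, len(agents), 2)]

def community_pairs(agents, rng):
    """Draw a fresh pairing by shuffling (assumed re-pairing scheme)."""
    shuffled = agents[:]
    rng.shuffle(shuffled)
    return fixed_pairs(shuffled)

def pairing_schedule(agents, n_epochs, community, switch_every=100, seed=0):
    """Yield the pairing that is in force at each epoch."""
    rng = random.Random(seed)
    pairs = fixed_pairs(agents)
    for epoch in range(n_epochs):
        # In the community setting, partners change after every block of epochs.
        if community and epoch > 0 and epoch % switch_every == 0:
            pairs = community_pairs(agents, rng)
        yield pairs

agents = list(range(10))  # ten agents, as in the reported experiments
fixed = list(pairing_schedule(agents, 300, community=False))
mixed = list(pairing_schedule(agents, 300, community=True))
```

In the fixed arrangement the same five pairs are returned for every epoch, whereas the community schedule keeps a pairing for 100 epochs and then replaces it.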

4 Results and Discussion 

The experimental setup described above resulted in 34 different parameter com- 
binations, for each of which 600 independent runs have been performed in order 
to obtain empirically reliable data. The evaluation of the data has been per- 
formed with the help of a number of measures that have been selected with the 
goal of being able to describe both the communicative usefulness of an evolved 
convention system, as well as compare its main properties to those of languages 
as we know them now (see pQ for a more detailed account). 
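
One way to arrive at 34 parameter combinations is to enumerate the grid as below. This is a sketch under an assumption that the text does not state explicitly: when success never matters, the minimum success threshold has no effect, so that level is counted once per arrangement rather than once per threshold. The constant and function names are hypothetical.

```python
from itertools import product

P_SM_LEVELS = [0.0, 0.25, 0.5, 0.75, 1.0]  # probability that success matters
THRESHOLDS = [0.25, 0.5, 0.75, 1.0]        # minimum success threshold
ARRANGEMENTS = ["pairs", "community"]      # interlocutor arrangement

def parameter_grid():
    """List every (arrangement, p_sm, threshold) combination."""
    grid = []
    for arrangement, p_sm in product(ARRANGEMENTS, P_SM_LEVELS):
        if p_sm == 0.0:
            # Assumption: the threshold is irrelevant if success never matters.
            grid.append((arrangement, p_sm, None))
        else:
            grid.extend((arrangement, p_sm, t) for t in THRESHOLDS)
    return grid

grid = parameter_grid()
print(len(grid))  # 2 * (1 + 4 * 4) combinations
```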

In order to understand how well a communication system performs in a 
simulation, it is common to observe the understanding precision and recall rates, 
which can be combined into a single F-measure
($F_1 = 2 \cdot \frac{\mathrm{precision} \cdot \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}}$).
As can be seen from Figure 1(a), the results suggest that having a higher $p_{sm}$ has
a direct effect on the understanding rates of a community (t value between
26.68 and 210.63, p < 0.0001). However, a communication setup in which agents
communicate with each other in turns as opposed to with a fixed partner does
not appear to be advantageous for the establishment of a reliable means of
communication (t = -15.85, p < 0.0001). Looking further, Figure 1(b) indicates
that, just as observed in [7], agents operating in a community have a larger
amount of variation available to them, in our case in the form of a larger lexicon 
(t = 35.52, p < 0.0001). However, unlike in the empirical study, the agents in the 
LEW do not benefit from this property, among other things due to the lack of an 
ability to enter into a negotiation about conventions to use in a given context. 
It is important to note at this stage that the understanding measure presented
in Figure 1(a) only takes into account the interactions that have been successful,
i.e. were not below the minimum success threshold in cases where success was 
chosen to matter. Consequently, this figure does not tell us how well the agents' 
lexicons are actually equipped to interpret a wide range of utterances. In order 



5 Setting the minimum success threshold to 0 is equivalent to having $p_{sm} = 0$.

6 In both cases, the agent population was set to ten and so each 'epoch' comprised 
ten interactions, whereby every agent would on average take part in two interactions: 
once as a speaker and once as a hearer. 
Fig. 1. Effect of the interaction type and the probability that success matters on (a)
communicative success and (b) agent lexicon size. 



to evaluate the lexicons of agents without any effect that simple guessing luck 
might have on understanding, we take a look at two further measures: lexicon 
use, i.e. the average ratio of forms of an utterance that the hearer agent was able 
to find in his lexicon, and lexicon precision, i.e. the ratio of correct meanings 
found by the hearer, in the cases where the agent used his lexicon for decoding 
a form. Furthermore, the decrease in lexicon size alone does not provide any 
specific information as to what exactly is happening to the agents' lexicons. In 
other terms, further measures are required that could explain what effect the 
decrease actually has on the expressive and interpretative potential of a lexicon. 
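
The understanding F-measure and the two lexicon measures defined above can be computed along the following lines. This is a minimal sketch for a single utterance, not the LEW's implementation: the data layout (an utterance as a list of form and intended-meaning pairs, a lexicon as a dictionary with one meaning per form) and all names are assumptions.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (F1)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def lexicon_use(utterance_forms, lexicon):
    """Ratio of forms in the utterance that the hearer finds in its lexicon."""
    if not utterance_forms:
        return 0.0
    known = sum(1 for form in utterance_forms if form in lexicon)
    return known / len(utterance_forms)

def lexicon_precision(utterance, lexicon):
    """Ratio of correct meanings among the forms decoded via the lexicon.

    utterance -- list of (form, intended_meaning) pairs (assumed layout)
    lexicon   -- dict mapping form -> retrieved meaning (assumed layout)
    """
    decoded = [(form, meaning) for form, meaning in utterance if form in lexicon]
    if not decoded:
        return 0.0
    correct = sum(1 for form, meaning in decoded if lexicon[form] == meaning)
    return correct / len(decoded)

lexicon = {"ba": "run", "ku": "tree"}
utterance = [("ba", "run"), ("ku", "rock"), ("zo", "bird")]
use = lexicon_use([form for form, _ in utterance], lexicon)  # 2 of 3 forms known
prec = lexicon_precision(utterance, lexicon)                 # 1 of 2 decoded correctly
f1 = f_measure(0.5, 1.0)
```

Separating the two lexicon measures in this way keeps lucky guesses out of the picture: a form that is absent from the lexicon lowers lexicon use but does not enter lexicon precision.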
Fig. 2. Effect of the interaction type and the probability that success matters on (a)
lexicon use and (b) lexicon precision. 



Figure 2(a) depicts the rates of lexicon use, suggesting that with the increase
of $p_{sm}$ and the corresponding diminishing of lexicon size (t value between -40.06
and -75.26, p < 0.0001), the number of forms in an agent's lexicon appears to
decrease (t value between -39.81 and -78.23, p < 0.0001) with a significant
effect on lexicon use (t value between -4.57 and -20.38, p < 0.0001), as further
confirmed by Figure 3(b). The intuition is that for higher levels of $p_{sm}$, wrongly
guessed meanings are not being recorded in the agents' lexicons, resulting in
higher quality convention systems. This is confirmed by the increase in lexicon
precision (t value between 11.63 and 101.64, p < 0.0001) depicted in Figure 2(b).

Fig. 3. Effect of the interaction type and the probability that success matters on the
number of (a) unique meanings and (b) unique forms in agent lexicons.



Interestingly enough, the decrease in the number of different forms in agents'
lexicons does not seem to have a significant effect on agent lexicon synonymy
across the board (p > 0.1 for $p_{sm} = 0.25$; yet t value between -4.64 and -28.18,
p < 0.0001 for higher levels of $p_{sm}$) (see Figure 4(a)). Presumably, the reason
for this is that the drop-off in the number of distinct meanings (see Figure 3(a))
is directly proportional to that of distinct forms, which would explain the less
affected synonymy and homonymy ratios (see Figure 4(b) for a plot of the latter).
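
To make the counts of distinct meanings and forms concrete, the sketch below derives them from a single agent's lexicon. The synonymy and homonymy ratios are not defined in this section, so the definitions used here (average number of forms per meaning and average number of meanings per form) are assumptions chosen for illustration, as are the function and key names.

```python
from collections import defaultdict

def lexicon_stats(mappings):
    """Summarize one agent's lexicon given as (meaning, form) pairs."""
    forms_per_meaning = defaultdict(set)
    meanings_per_form = defaultdict(set)
    for meaning, form in mappings:
        forms_per_meaning[meaning].add(form)
        meanings_per_form[form].add(meaning)
    n_meanings = len(forms_per_meaning)
    n_forms = len(meanings_per_form)
    # Number of distinct meaning-form mappings.
    n_mappings = sum(len(forms) for forms in forms_per_meaning.values())
    return {
        "unique_meanings": n_meanings,
        "unique_forms": n_forms,
        # Assumed definitions, not taken from the paper:
        "synonymy": n_mappings / n_meanings if n_meanings else 0.0,
        "homonymy": n_mappings / n_forms if n_forms else 0.0,
    }

stats = lexicon_stats([("run", "ba"), ("run", "ku"), ("tree", "ku")])
```

In this toy lexicon, "run" has two forms and "ku" has two meanings, so both assumed ratios come out above one.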
Fig. 4. Effect of the interaction type and the probability that success matters on (a)
agent lexicon synonymy and (b) agent lexicon homonymy.
Fig. 5. Effect of the interaction type ((a) only) and the probability that success matters
on (a) average mapping share and (b) ratio of mappings shared by exactly X agents. 



So far we have only evaluated the results of our experiments from the point 
of view of the agents, either by looking at their observed interaction success or 
by evaluating the communicative potential of their lexicons. However, as one 
of the main topics of the presented study was the establishment of conventions 
in a community of interlocutors, we should also evaluate the simulation results 
from the point of view of conventions, i.e. meaning-form mappings. In fact, there 
is a significant effect of both the community setting (t value between 3.58 and 
91.14, p < 0.00035) and success mattering (t = 86.64, p < 0.0001) on the
number of agents that share a mapping on average, as depicted in Figure 5(a).



This effect is broken down in Figure 5(b), in which one can see the portion
of the global lexicon that is shared by any particular number of agents.⁷ The
effects observed in the latter figure can be further described by an equation like 
$\mathit{mapshare} = \alpha \cdot p_{sm} \cdot n^{-1}$, whereby the ratio of shared mappings (mapshare) is
directly proportional to success mattering ($p_{sm}$) and inversely proportional to
the number of agents ($n$) that are expected to know the mappings.
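
The convention-centred measures, i.e. the average number of agents sharing a mapping and the portion of the global lexicon known by exactly X agents, can be sketched as follows. The representation of each agent's lexicon as a set of (meaning, form) pairs and the function name `mapping_share` are assumptions for illustration.

```python
from collections import Counter

def mapping_share(lexicons):
    """Return the average share and the ratio of mappings known by exactly X agents.

    lexicons -- list of sets of (meaning, form) mappings, one set per agent
    """
    # For every mapping in the global lexicon, count the agents that know it.
    counts = Counter(mapping for lexicon in lexicons for mapping in lexicon)
    if not counts:
        return 0.0, {}
    average_share = sum(counts.values()) / len(counts)
    by_agents = Counter(counts.values())
    ratios = {x: k / len(counts) for x, k in sorted(by_agents.items())}
    return average_share, ratios

lexicons = [
    {("run", "ba"), ("tree", "ku")},
    {("run", "ba"), ("rock", "zo")},
    {("run", "ba"), ("tree", "ku")},
]
avg, ratios = mapping_share(lexicons)
```

In this toy community, one mapping is known by all three agents, one by two agents and one by a single agent; the entry `ratios[1]` corresponds to the remainder of mappings that are not shared.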



5 Conclusions and Future Work 

In summary, experiencing a degree of success provides the all-important
foundation required for establishing linguistic conventions in task-oriented dialogue
and dispersing these throughout the community. The implication of this finding
is that language is very unlikely to have emerged for the benefit of a
success-agnostic activity, such as gossip (cf. [3]), but has presumably evolved as an
adaptational necessity at a time when human cooperation had become essential.
The shortcomings of the community setting can be attributed to the LEW's 
implementation of interactions as two autonomous activities and the lack of 
success-based adjustment of mapping usage strategies. Future work should aim 
to improve this aspect by looking into the interactive alignment model (cf. [12]).



7 The remainder of the mappings is not shared, i.e. known by only one agent. 